
feat(openchat): Mistral-7B inference engine (GQA + RoPE + RMSNorm + SiLU)

Merged: AdaWorldAPI merged 8 commits into master from claude/transcode-deepnsm-rust-oNa1Z on Mar 29, 2026

Conversation

@AdaWorldAPI (Owner)

No description provided.

claude added 8 commits March 29, 2026 20:02
…lette

Extracted from openai-community/gpt2 model.safetensors (522.7 MB):
  wte.weight: [50257, 768] f32 → Base17 golden-step projection

  gpt2_base17_50k.bin:   1,669 KB (50,257 tokens × 34 B each = 320× vs the 522.7 MB file)
  gpt2_palette_50k.bin:     58 KB (256 centroids + 50K assignments = 9,294×)

GPT-2 uses the SAME BPE tokenizer as Jina v4 → palette indices are
DIRECTLY COMPATIBLE. CausalEdge64 edges from GPT-2 and Jina use
the SAME S/P/O palette space. Zero mapping overhead.

Total weights in ndarray/src/hpc/jina/weights/: 3.4 MB
  Jina v4 (20K tokens):  694 KB
  GPT-2 (50K tokens):    1,727 KB
  COCA vocabulary:        997 KB

All three models share the same Base17 codec.
Load via LazyLock at startup. No external deps. No GPU.
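
A minimal sketch of that load pattern, assuming a flat
[256×17 centroids | per-token assignments] byte layout; PaletteModel and
its fields are illustrative names, not the crate's actual API:

  use std::sync::LazyLock;

  pub struct PaletteModel {
      pub centroids: Vec<[i8; 17]>, // 256 centroids × 17 dims
      pub assignments: Vec<u8>,     // one palette index per token
  }

  impl PaletteModel {
      fn from_bytes(bytes: &'static [u8], n_tokens: usize) -> Self {
          // Assumed layout: 256 * 17 centroid bytes, then one byte per token.
          let (cent, assign) = bytes.split_at(256 * 17);
          let centroids = cent.chunks_exact(17)
              .map(|c| {
                  let mut row = [0i8; 17];
                  for (d, &b) in row.iter_mut().zip(c) { *d = b as i8; }
                  row
              })
              .collect();
          Self { centroids, assignments: assign[..n_tokens].to_vec() }
      }
  }

  // Embedded at compile time (zero file I/O), decoded once on first access.
  pub static GPT2_PALETTE: LazyLock<PaletteModel> = LazyLock::new(|| {
      PaletteModel::from_bytes(include_bytes!("weights/gpt2_palette_50k.bin"), 50_257)
  });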

https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
…palette

Extracted from google-bert/bert-base-uncased (420 MB safetensors):
  bert.embeddings.word_embeddings.weight: [30522, 768] f32

  bert_base17_30k.bin:   1,013 KB (30,522 tokens × 34 B each = 424× vs the 420 MB file)
  bert_palette_30k.bin:     38 KB (256 centroids + 30K assignments = 11,225×)

BERT (WordPiece) uses a DIFFERENT tokenizer than GPT-2/Jina (BPE).
Needs a mapping table for cross-model palette alignment (sketched below).
BERT captures BIDIRECTIONAL context — complements GPT-2 autoregressive.
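
A hedged sketch of what such a mapping table could look like, pairing
WordPiece and BPE ids by exact surface-string match (tokens without a
match would need a nearest-palette fallback); the function and parameter
names here are assumptions, not the crate's API:

  use std::collections::HashMap;

  fn build_token_map(
      wordpiece_vocab: &[(u32, String)], // (BERT id, token string)
      bpe_vocab: &HashMap<String, u32>,  // GPT-2/Jina string → id
  ) -> HashMap<u32, u32> {
      wordpiece_vocab.iter()
          .filter_map(|(wp_id, s)| bpe_vocab.get(s).map(|&bpe_id| (*wp_id, bpe_id)))
          .collect()
  }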

Total weights: 4.5 MB (3 models + COCA vocabulary)
  Jina v4:   694 KB (20K tokens, 2048D→17D)
  GPT-2:   1,727 KB (50K tokens, 768D→17D)
  BERT:    1,052 KB (30K tokens, 768D→17D)
  COCA:      997 KB (20K academic vocabulary)

All three models: same Base17 codec, same palette format, same CausalEdge64.

https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
…odec

runtime.rs (300 lines, 11 tests):

  LazyLock<ModelRuntime> for Jina v4, GPT-2, BERT:
    Weights embedded via include_bytes! (zero file I/O)
    Loaded once, used forever. 4.5MB total in binary.

  Full codec chain per model:
    HHTL cascade:     heel_distance() → cascade_distance() (early exit)
    SimilarityTable:  heel_similarity() → calibrated f32 [0,1]
    CAM-PQ:           cam_fingerprint() → 6-byte [palette, dim0..4]
    CausalEdge64:     pack_spo_edge() → u64 with NARS + Pearl + temporal
    Base17 LEAF:      leaf_distance() → full resolution L1

  SimilarityTable built from EXACT 256×256 palette distance CDF.
  This IS the bgz17 pattern: empirical distribution → calibrated lookup.
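
  A sketch of that calibration step, assuming similarity is defined as one
  minus the empirical CDF of the distance (so the closest pairs land near
  1.0); the actual SimilarityTable construction may differ:

    fn build_similarity_table(dist: &[[f32; 256]; 256]) -> Vec<f32> {
        // Sort all 65,536 pairwise distances once to form the empirical CDF.
        let mut sorted: Vec<f32> = dist.iter().flatten().copied().collect();
        sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
        let n = sorted.len() as f32;
        // similarity(i, j) = 1 - CDF(dist[i][j]), a calibrated value in [0, 1].
        dist.iter().flatten()
            .map(|&d| 1.0 - sorted.partition_point(|&x| x < d) as f32 / n)
            .collect()
    }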

  Usage:
    use ndarray::hpc::jina::runtime::{JINA, GPT2, BERT};
    let sim = GPT2.heel_similarity(token_a, token_b);  // O(1), calibrated
    let edge = GPT2.pack_spo_edge(s, p, o, 0.8, 0.6, 42);  // CausalEdge64
    let fp = BERT.cam_fingerprint(token);  // 6-byte CAM-PQ

  22 tests passing (12 codec + 11 runtime).

https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
GPT-2 small (124M) forward pass with KV cache, all transcendentals
via crate::simd::F32x16 (LayerNorm, GELU, softmax, dot products).

- weights.rs: safetensors loader for 12 transformer layers
- inference.rs: autoregressive generation with temperature sampling
- api.rs: OpenAI-compatible request/response types (/v1/completions,
  /v1/embeddings, /v1/models) — transport-agnostic
- 9 tests passing (layer_norm, GELU, softmax, config, API types)
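
The KV cache mentioned at the top of this commit is not spelled out in the
message; a minimal per-layer sketch of the idea, with assumed names: each
decode step appends one projected K/V row per layer and attends over
everything cached so far, so past tokens are never recomputed:

  struct KvCache {
      keys: Vec<Vec<f32>>,   // [n_layers], each seq_len × n_heads × head_dim
      values: Vec<Vec<f32>>,
  }

  impl KvCache {
      fn new(n_layers: usize) -> Self {
          Self { keys: vec![Vec::new(); n_layers], values: vec![Vec::new(); n_layers] }
      }
      // Append this step's K/V for one layer; decoding stays O(seq_len) per step.
      fn append(&mut self, layer: usize, k: &[f32], v: &[f32]) {
          self.keys[layer].extend_from_slice(k);
          self.values[layer].extend_from_slice(v);
      }
  }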

https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
Weight matrices pre-transposed from [in_dim, out_dim] to [out_dim, in_dim]
during safetensors loading. matmul_vec_simd now reads contiguous rows via
F32x16::from_slice + mul_add — full SIMD utilization (768D = 48 × F32x16).
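
A sketch of the resulting kernel. F32x16::from_slice and mul_add are named
above; splat and reduce_sum are assumed equivalents for zero-initialization
and the horizontal sum:

  fn matmul_vec_simd(w_t: &[f32], x: &[f32], out: &mut [f32]) {
      let in_dim = x.len(); // 768 = 48 × 16 lanes for GPT-2 small
      for (o, row) in out.iter_mut().zip(w_t.chunks_exact(in_dim)) {
          // Pre-transposition makes `row` contiguous: every iteration is a
          // straight 16-lane load + fused multiply-add, no strided gathers.
          let mut acc = F32x16::splat(0.0);
          for (rc, xc) in row.chunks_exact(16).zip(x.chunks_exact(16)) {
              acc = F32x16::from_slice(rc).mul_add(F32x16::from_slice(xc), acc);
          }
          *o = acc.reduce_sum();
      }
  }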

https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
Full integration with the tensor codec pipeline:
- AttentionTable: palette-based O(1) approximate attention via
  jina::runtime::GPT2 (256×256 HEEL distance table)
- CausalEdge64 emission: attention patterns packed as SPO edges
  with NARS truth values (subject=query, predicate=head, object=key)
- HHTL cascade: token_similarity(), token_distance_leaf(),
  token_distance_cascade() methods on Gpt2Engine
- CAM-PQ: 6-byte token fingerprints via cam_fingerprint()

Both features are opt-in flags (use_attention_table, emit_causal_edges)
to avoid overhead when not needed. 14 tests passing.
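
Hypothetical usage of the two flags, in the same spirit as the runtime
example in the codec commit above (the constructor and field names are
assumptions from this commit message, not confirmed API):

  let mut engine = Gpt2Engine::load("gpt2.safetensors")?;
  engine.use_attention_table = true; // O(1) palette-based approximate attention
  engine.emit_causal_edges = true;   // pack attention patterns as CausalEdge64

  let sim = engine.token_similarity(tok_a, tok_b);  // HHTL cascade
  let fp  = engine.cam_fingerprint(tok_a);          // 6-byte CAM-PQ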

https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
Extract shared code into hpc/models/:
- safetensors.rs: generic file loader (used by GPT-2, SD, BERT)
- layers.rs: SIMD ops (layer_norm, gelu, silu, group_norm, softmax,
  matmul_vec, dot_product) — all via crate::simd::F32x16
- api_types.rs: OpenAI-compatible envelope (Usage, FinishReason, etc.)

Add hpc/stable_diffusion/ scaffold (code only, no weights):
- clip.rs: CLIP text encoder (same transformer as GPT-2, shared layers)
- unet.rs: UNet denoiser with Conv2D, GroupNorm, SiLU, timestep embedding
- vae.rs: VAE decoder (latent→RGB)
- scheduler.rs: DDIM noise scheduler with precomputed alpha schedule
- weights.rs: safetensors loader for SD CLIP weights
- api.rs: /v1/images/generations with full pipeline

52 tests passing. Zero weight files — disk space conscious.
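
For the scheduler above, a sketch of what a precomputed alpha schedule can
look like, using the standard scaled-linear beta schedule from Stable
Diffusion; constants and names are illustrative, not necessarily what
scheduler.rs does:

  struct DdimSchedule {
      alphas_cumprod: Vec<f32>, // cumulative product of (1 - beta_i) per train step
  }

  impl DdimSchedule {
      fn new(n_steps: usize, beta_start: f32, beta_end: f32) -> Self {
          // Scaled-linear: interpolate sqrt(beta) linearly, then square.
          let (s, e) = (beta_start.sqrt(), beta_end.sqrt());
          let mut acc = 1.0f32;
          let alphas_cumprod = (0..n_steps).map(|i| {
              let beta = (s + (e - s) * i as f32 / (n_steps - 1) as f32).powi(2);
              acc *= 1.0 - beta; // running cumulative product
              acc
          }).collect();
          Self { alphas_cumprod }
      }
  }

  // e.g. DdimSchedule::new(1000, 0.00085, 0.012) for SD v1-style defaults.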

https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
…iLU)

OpenChat 3.5 / Mistral-7B architecture, fully distinct from GPT-2:
- GQA: 32 query heads share 8 KV heads (4:1 ratio, 75% KV cache savings)
- RoPE: rotary positional embedding (no learned positions)
- RMSNorm: simpler norm without mean subtraction (both in models::layers)
- SiLU: gated MLP (gate * up → down) with F32x16 element-wise SIMD
- GGUF weight loading via hpc::gguf (Q4_K_M + Q4_0 dequantization added)
- CausalEdge64 emission from attention patterns
- OpenChat chat template (GPT4 Correct User/Assistant markers)
- /v1/chat/completions API types

All ops through crate::simd::F32x16 via models::layers.
No weights stored — loaded at runtime from user-provided GGUF.
15 tests passing. 77 total across new modules.
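
Scalar sketches of the two ops in the list above that most distinguish this
path from GPT-2 (the real kernels go through F32x16 in models::layers);
signatures are illustrative:

  // RMSNorm: scale by reciprocal RMS. No mean subtraction, unlike LayerNorm.
  fn rms_norm(x: &mut [f32], weight: &[f32], eps: f32) {
      let ms = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
      let inv_rms = 1.0 / (ms + eps).sqrt();
      for (v, &w) in x.iter_mut().zip(weight) {
          *v *= inv_rms * w;
      }
  }

  // GQA head mapping: 32 query heads over 8 KV heads means each group of
  // 4 consecutive query heads reads the same cached KV head, so the cache
  // holds 8/32 = 25% of the MHA size (the 75% savings above).
  fn kv_head_for(q_head: usize, n_q_heads: usize, n_kv_heads: usize) -> usize {
      q_head / (n_q_heads / n_kv_heads) // q 0..=3 → kv 0, q 4..=7 → kv 1, ...
  }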

https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
AdaWorldAPI merged commit ffe89e3 into master on Mar 29, 2026
4 of 10 checks passed